In this paper, we focus on the problem of feature learning in the presence of scale imbalance for 6-DoF grasp detection and propose a novel approach that specifically addresses the difficulty of handling small-scale samples. A Multi-scale Cylinder Grouping (MsCG) module is presented to enhance local geometry representation by combining multi-scale cylinder features and global context. Moreover, a Scale Balanced Learning (SBL) loss and an Object Balanced Sampling (OBS) strategy are designed: SBL uses a priori weights to enlarge the gradients of samples whose scales occur with low frequency, while OBS captures more points on small-scale objects with the help of an auxiliary segmentation network. They alleviate the influence of the uneven distribution of grasp scales in training and inference, respectively. In addition, Noisy-clean Mix (NcM) data augmentation is introduced to facilitate training; it aims to bridge the domain gap between synthetic and raw scenes efficiently by generating additional data that mix the two into single scenes at the instance level. Extensive experiments on the GraspNet-1Billion benchmark show competitive results, with significant gains on small-scale cases. In addition, real-world grasping performance highlights the method's generalization ability. Our code is available at https://github.com/mahaoxiang822/Scale-Balanced-Grasp.
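The re-weighting idea behind SBL can be illustrated with a small sketch. The abstract only says that samples from low-frequency scales receive larger a priori weights; the inverse-frequency rule, the bin edges, and all function names below are assumptions chosen for illustration, not the authors' implementation.

```python
from collections import Counter


def scale_bin(width, edges):
    """Return the index of the first bin whose upper edge is >= width."""
    for i, upper in enumerate(edges):
        if width <= upper:
            return i
    return len(edges)  # widths beyond the last edge share one overflow bin


def inverse_frequency_weights(widths, edges):
    """Per-bin weights proportional to 1/frequency (assumed rule),
    scaled so that the average weight over all samples is 1."""
    bins = [scale_bin(w, edges) for w in widths]
    counts = Counter(bins)
    n, k = len(bins), len(counts)
    return {b: n / (k * c) for b, c in counts.items()}


def scale_balanced_loss(losses, widths, edges, weights):
    """Weighted mean of per-sample losses; rare scales weigh more."""
    total = sum(weights[scale_bin(w, edges)] * loss
                for loss, w in zip(losses, widths))
    return total / len(losses)


# Hypothetical grasp widths in metres: one small-scale, three large-scale.
widths = [0.01, 0.05, 0.05, 0.05]
edges = [0.02, 0.04]
weights = inverse_frequency_weights(widths, edges)
loss = scale_balanced_loss([1.0, 1.0, 1.0, 1.0], widths, edges, weights)
```

Because the weights multiply each per-sample loss, they scale the corresponding gradients by the same factor, which is the effect the abstract attributes to SBL. The weights here are computed once from the training set, so a width falling into a bin never seen in training would need separate handling.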
translated by Google Translate
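As more monitoring systems are deployed in smart cities, there is a growing demand for converting new human-specified requirements into machine-understandable formal specifications. However, these human-specified requirements are usually written in English and contain missing, inaccurate, or ambiguous information. In this paper, we present CitySpec, an intelligent assistant system for smart cities. CitySpec not only helps overcome the language gap between English requirements and formal specifications, but also offers solutions for missing, inaccurate, or ambiguous information. The goal of this paper is to demonstrate how CitySpec works. Specifically, we present three demos: (1) interactive completion of requirements in CitySpec; (2) human-in-the-loop correction when CitySpec encounters exceptions; and (3) online learning in CitySpec.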
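An increasing number of monitoring systems have been developed in smart cities to ensure that a city's real-time operations satisfy safety and performance requirements. However, many existing city requirements are written in English with missing, inaccurate, or ambiguous information. There is a high demand for assisting city policymakers in converting human-specified requirements into machine-understandable formal specifications for monitoring systems. To address this limitation, we build CitySpec, the first intelligent assistant system for requirement specification in smart cities. To create CitySpec, we first collect more than 1,500 real-world city requirements across different domains from over 100 cities and extract city-specific knowledge to generate a city vocabulary dataset of 3,061 words. We also build a translation model and enhance it through requirement synthesis, and we develop a novel online learning framework with validation under uncertainty. Evaluation results on real-world city requirements show that CitySpec increases the sentence-level accuracy of requirement specification from 59.02% to 86.64% and adapts strongly to new cities and new domains (e.g., the F1 score for requirements in Seattle increases from 77.6% to 93.75% with online learning).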
Recent studies have found that pain in infancy has a significant impact on infant development, including psychological problems, possible brain injury, and pain sensitivity in adulthood. However, because specialists are scarce and infants cannot verbally express their experience of pain, infant pain is difficult to assess. Most existing infant pain assessment systems directly apply adult methods to infants, ignoring the differences between infant and adult expressions. Meanwhile, as the study of the facial action coding system continues to advance, the use of action units (AUs) opens up new possibilities for expression recognition and pain assessment. In this paper, a novel AuE-IPA method is proposed for assessing infant pain by leveraging the different engagement levels of AUs. First, the different engagement levels of AUs in infant pain are revealed by analyzing the class activation map of an end-to-end pain assessment model. The intensities of the top-engaged AUs are then used in a regression model to achieve automatic infant pain assessment. The proposed model is trained and evaluated on the YouTube Immunization dataset, the YouTube Blood Test dataset, and the iCOPEVid dataset. The experimental results show that our AuE-IPA method is more applicable to infants and generalizes better than both the end-to-end assessment model and the classic PSPI metric.
In this paper, we propose an end-to-end framework that jointly learns keypoint detection, descriptor representation, and cross-frame matching for the task of image-based 3D localization. Prior art has tackled each of these components individually, purportedly to alleviate the difficulty of effectively training a holistic network. We design a self-supervised image warping correspondence loss for both feature detection and matching, a weakly supervised epipolar constraints loss on relative camera pose learning, and a directional matching scheme that detects keypoint features in a source image and performs a coarse-to-fine correspondence search on the target image. We leverage this framework to enforce cycle consistency in our matching module. In addition, we propose a new loss to robustly handle both definite inlier/outlier matches and less-certain matches. The integration of these learning mechanisms enables end-to-end training of a single network that performs all three localization components. Benchmarking our approach on public datasets shows that such an end-to-end framework yields more accurate localization, outperforming both traditional methods and state-of-the-art weakly supervised methods.
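The notion of cycle consistency in matching can be made concrete with a small sketch. The abstract enforces it through training; the hard forward-backward check below, the use of squared Euclidean distance on descriptors, and the function names are all assumptions made for illustration and are not the paper's loss.

```python
def nearest(query, candidates):
    """Index of the candidate with the smallest squared Euclidean distance."""
    return min(
        range(len(candidates)),
        key=lambda j: sum((q - c) ** 2 for q, c in zip(query, candidates[j])),
    )


def cycle_consistent_matches(src, tgt):
    """Keep (i, j) only if matching src[i] to the target and then matching
    the result back to the source returns to index i."""
    matches = []
    for i, descriptor in enumerate(src):
        j = nearest(descriptor, tgt)   # source -> target
        if nearest(tgt[j], src) == i:  # target -> source closes the cycle
            matches.append((i, j))
    return matches


# Hypothetical 2-D descriptors; src[1] has no consistent partner.
src = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
tgt = [[0.1, 0.0], [5.0, 5.1]]
matches = cycle_consistent_matches(src, tgt)
```

In this toy input, `src[1]` is matched to `tgt[0]`, but `tgt[0]` maps back to `src[0]`, so the pair is rejected as inconsistent.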
Machine learning has emerged recently as a powerful tool for predicting properties of quantum many-body systems. For many ground states of gapped Hamiltonians, generative models can learn from measurements of a single quantum state to reconstruct the state accurately enough to predict local observables. Alternatively, kernel methods can predict local observables by learning from measurements on different but related states. In this work, we combine the benefits of both approaches and propose the use of conditional generative models to simultaneously represent a family of states, by learning shared structures of different quantum states from measurements. The trained model allows us to predict arbitrary local properties of ground states, even for states not present in the training data, and without necessitating further training for new observables. We numerically validate our approach (with simulations of up to 45 qubits) for two quantum many-body problems, 2D random Heisenberg models and Rydberg atom systems.
Early-exiting dynamic neural networks (EDNNs), one type of dynamic neural network, have been widely studied recently. A typical EDNN has multiple prediction heads at different layers of the network backbone. During inference, the model exits at either the last prediction head or an intermediate prediction head where the prediction confidence is higher than a predefined threshold. To optimize the model, these prediction heads, together with the network backbone, are trained on every batch of training data. This creates a train-test mismatch: all the prediction heads are optimized on all types of data in the training phase, while the deeper heads only see difficult inputs in the testing phase. Treating inputs differently in the two phases causes a mismatch between the training and testing data distributions. To mitigate this problem, we formulate an EDNN as an additive model inspired by gradient boosting and propose multiple training techniques to optimize the model effectively. We name our method BoostNet. Our experiments show that it achieves state-of-the-art performance on the CIFAR100 and ImageNet datasets in both anytime and budgeted-batch prediction modes. Our code is released at https://github.com/SHI-Labs/Boosted-Dynamic-Networks.
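The early-exit inference rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: each head is represented by the class-probability list it would output, confidence is taken to be the maximum probability, and the name `early_exit_predict` is hypothetical rather than taken from the released code.

```python
def early_exit_predict(head_outputs, threshold):
    """Return (exit_index, predicted_class).

    head_outputs: iterable of per-head class-probability lists, ordered
    from the shallowest head to the deepest one. The first head whose
    confidence exceeds the threshold is used; otherwise the last head is.
    """
    last = None
    for index, probs in enumerate(head_outputs):
        confidence = max(probs)
        last = (index, probs.index(confidence))
        if confidence > threshold:
            return last  # confident enough: exit early
    if last is None:
        raise ValueError("at least one prediction head is required")
    return last  # no head was confident: fall back to the last head


# Hypothetical outputs of three heads for one input.
outputs = [[0.6, 0.4], [0.1, 0.9], [0.5, 0.5]]
easy = early_exit_predict(outputs, threshold=0.8)
hard = early_exit_predict(outputs, threshold=0.95)
```

Passing a generator as `head_outputs` means deeper heads are never evaluated once an earlier head exits, which is where the compute saving comes from. It also makes the mismatch visible: at test time a deep head only receives inputs that every shallower head was unsure about.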
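One of the key challenges in learning an online recommendation model is temporal domain shift, which causes a mismatch between the training and testing data distributions and hence a domain generalization error. To overcome this, we propose to learn a future gradient generator that forecasts the gradient information of the future data distribution for training, so that the recommendation model can be trained as if we were able to look ahead at the future of its deployment. Compared with batch update, our theory suggests that the proposed algorithm achieves a smaller temporal domain generalization error, measured by a gradient variation term in a local regret. We demonstrate the empirical advantage by comparing with various representative baselines.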
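In the context of distributed deep learning, the issue of stale weights or gradients can result in poor algorithmic performance. This issue is usually tackled by delay-tolerant algorithms under some mild assumptions on the objective functions and step sizes. In this paper, we take a different approach and develop a new algorithm, called $\textbf{P}$redicting $\textbf{C}$lipping $\textbf{A}$synchronous $\textbf{S}$tochastic $\textbf{G}$radient $\textbf{D}$escent (aka, PC-ASGD). Specifically, PC-ASGD has two steps: the $\textit{predicting step}$ uses gradient prediction based on Taylor expansion to reduce the staleness of the outdated weights, while the $\textit{clipping step}$ selectively drops the outdated weights to alleviate their negative effects. A tradeoff parameter is introduced to balance the effects of these two steps. Theoretically, we present the convergence rate, accounting for the effect of delay, when the smooth objective functions are weakly strongly convex and nonconvex. A practical variant of PC-ASGD is also proposed, which adopts a condition to help determine the tradeoff parameter. For empirical validation, we demonstrate the performance of the algorithm with two deep neural network architectures on two benchmark datasets.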
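The two ingredients named in this abstract, Taylor-based gradient prediction and dropping of outdated information, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the diagonal Hessian, the delay bound `max_delay`, the convex blend controlled by `theta`, and all function names. It is not the PC-ASGD update rule, which the abstract does not spell out.

```python
def taylor_predicted_gradient(stale_grad, hess_diag, w_now, w_stale):
    """First-order Taylor expansion of the gradient around w_stale:
    g(w_now) ~= g(w_stale) + H (w_now - w_stale), with diagonal H."""
    return [g + h * (wn - ws)
            for g, h, wn, ws in zip(stale_grad, hess_diag, w_now, w_stale)]


def drop_outdated(updates, max_delay):
    """Keep gradients from (gradient, delay) pairs whose delay is at most
    max_delay (a hypothetical stand-in for the clipping step)."""
    return [grad for grad, delay in updates if delay <= max_delay]


def blend(predicted, fresh, theta):
    """Convex combination theta * predicted + (1 - theta) * fresh, where
    theta plays the role of a tradeoff parameter (assumed form)."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return [theta * p + (1.0 - theta) * f for p, f in zip(predicted, fresh)]


# Toy quadratic f(w) = 0.5 * sum(a_i * w_i**2): gradient a_i * w_i, Hessian diag(a).
a = [2.0, 4.0]
w_stale, w_now = [1.0, 1.0], [0.5, 0.25]
stale_grad = [ai * wi for ai, wi in zip(a, w_stale)]
predicted = taylor_predicted_gradient(stale_grad, a, w_now, w_stale)
true_grad = [ai * wi for ai, wi in zip(a, w_now)]
```

For a quadratic objective the first-order expansion of the gradient is exact, so `predicted` equals `true_grad` here; for general objectives it is only an approximation whose error grows with the distance between the stale and current weights.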
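Sports game summarization aims to generate sports news from real-time commentaries. The task has attracted wide research attention but remains under-explored due to the lack of corresponding English datasets. Therefore, in this paper, we release GOAL, the first English sports game summarization dataset. Specifically, GOAL contains 103 commentary-news pairs, with average lengths of 2724.9 and 476.3 words for commentaries and news, respectively. Moreover, to support research in the semi-supervised setting, GOAL additionally provides 2,160 unlabeled commentary documents. Based on GOAL, we build and evaluate several baselines, including extractive and abstractive ones. Experimental results show that the task remains challenging. We hope our work will promote research on sports game summarization. The dataset has been released at https://github.com/krystalan/goal.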